Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition build system of cuda_cccl and cuda_parallel to scikit-build-core #3597

Merged

Conversation

oleksandr-pavlyk
Copy link
Contributor

Description

closes gh-2334

Change build system of internal Python package cuda_cccl that installs CCCL headers into Python prefix.

The advantage of using scikit-build-core is that installation of headers is controlled by project's top-level "CMakeLists.txt"

A minimal pytest-based test suite is added as well.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Transition from using setuptools to using scikit-build-core, so
that CMakeLists.txt is used to ensure that include is the same
as in the cuda toolkit.

Added test folder to test that the package works correctly.
Use suggestion by Bradley to use scikit-build-core regex facility
to extract version value from "_version.py"
Copy link
Contributor

🟥 CI finished in 4m 36s: Pass: 0%/1 | Total: 4m 36s | Avg: 4m 36s | Max: 4m 36s
  • 🟥 python: Pass: 0%/1 | Total: 4m 36s | Avg: 4m 36s | Max: 4m 36s

    🟥 cpu
      🟥 amd64              Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    🟥 ctk
      🟥 12.6               Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    🟥 cudacxx
      🟥 nvcc12.6           Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    🟥 cxx
      🟥 GCC13              Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    🟥 cxx_family
      🟥 GCC                Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    🟥 gpu
      🟥 v100               Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    🟥 jobs
      🟥 Test               Pass:   0%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 1)

# Runner
1 linux-amd64-gpu-v100-latest-1

@leofang
Copy link
Member

leofang commented Jan 30, 2025

cc @bdice @vyasr for vis

Copy link
Contributor

🟥 CI finished in 6m 03s: Pass: 0%/1 | Total: 6m 03s | Avg: 6m 03s | Max: 6m 03s
  • 🟥 python: Pass: 0%/1 | Total: 6m 03s | Avg: 6m 03s | Max: 6m 03s

    🟥 cpu
      🟥 amd64              Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    🟥 ctk
      🟥 12.6               Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    🟥 cudacxx
      🟥 nvcc12.6           Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    🟥 cxx
      🟥 GCC13              Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    🟥 cxx_family
      🟥 GCC                Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    🟥 gpu
      🟥 v100               Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    🟥 jobs
      🟥 Test               Pass:   0%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 1)

# Runner
1 linux-amd64-gpu-v100-latest-1

Copy link
Contributor

🟩 CI finished in 47m 48s: Pass: 100%/1 | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
  • 🟩 python: Pass: 100%/1 | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 47m 48s | Avg: 47m 48s | Max: 47m 48s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 1)

# Runner
1 linux-amd64-gpu-v100-latest-1

1. Use pytest fixture to reuse computed inc_paths
2. Split long test into smaller individual tests with descriptive names
3. Add tests to check that directories contain expected marker files (versions)
4. Avoid using c4l in favor of cccl
Copy link
Contributor

🟩 CI finished in 45m 37s: Pass: 100%/1 | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
  • 🟩 python: Pass: 100%/1 | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 45m 37s | Avg: 45m 37s | Max: 45m 37s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 1)

# Runner
1 linux-amd64-gpu-v100-latest-1

@leofang
Copy link
Member

leofang commented Jan 31, 2025

Q: How much efforts is needed to also build cuda.parallel with skbuild-core? I think it was the main mission of #2334 because it contains the C library to be built with nvcc.

@oleksandr-pavlyk
Copy link
Contributor Author

Oh, you are right. This does not close gh-2334..

Transitioning cuda.paralallel to scikit-build-core requires decoupling the building of cccl-c library and making it an imported dependency. I should perhaps make this change in a separate PR, but the claim of closing gh-2334 needs to be removed

@oleksandr-pavlyk
Copy link
Contributor Author

It appears there is no issue to change cuda_cccl build system, and each PR must close an issue. So should one be created?

@oleksandr-pavlyk
Copy link
Contributor Author

Actually, I do not think this is ready yet @leofang

This PR adds an empty file cuda/__init__.py which clobbers the cuda/__init__.py from cuda-python package. This is not good. Also, multiple packages installing into the ${PREFIX}/lib/pythonX.Y/site-packages/cuda/ is not a good practice because of possibility of clobbering.

So cuda/__init__.py must be removed.

@leofang
Copy link
Member

leofang commented Jan 31, 2025

Good catch, yes I misread python/cuda_cccl/cuda/__init__.py and thought it's installed as cuda_cccl/cuda/__init__.py. This needs to be fixed. cuda is a namespace package (NVIDIA/cuda-python#75).

@leofang leofang self-requested a review January 31, 2025 15:31
@oleksandr-pavlyk
Copy link
Contributor Author

I pushed a change that removes cuda/__init__.py. I also changed build system for cuda_parallel to scikit-build-core as well. This time the change should indeed close gh-2334 :)

Copy link
Contributor

🟩 CI finished in 46m 01s: Pass: 100%/1 | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
  • 🟩 python: Pass: 100%/1 | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 46m 01s | Avg: 46m 01s | Max: 46m 01s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
+/- python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 1)

# Runner
1 linux-amd64-gpu-v100-latest-1

Copy link
Contributor

@shwina shwina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (but would also wait for Bradley & Leo to approve before merging)!

Can we change the title of this PR to:

Transition build system of cuda_cccl and cuda_parallel to scikit-build-core

Awesome work @oleksandr-pavlyk!

@oleksandr-pavlyk oleksandr-pavlyk changed the title Transition build system of cuda_cccl to scikit-build-core Transition build system of cuda_cccl and cuda_parallel to scikit-build-core Jan 31, 2025
@oleksandr-pavlyk
Copy link
Contributor Author

@leofang Overall, do we have tests that various python packages installing into site-packages/cuda do not clobber one another? This could be tested by inspecting RECORD files from their wheels.

@leofang
Copy link
Member

leofang commented Feb 2, 2025

Overall, do we have tests that various python packages installing into site-packages/cuda do not clobber one another? This could be tested by inspecting RECORD files from their wheels.

I think the Kitmaker team will eventually handle this test. For now we are on our own.

@leofang
Copy link
Member

leofang commented Feb 2, 2025

@NVIDIA/cccl-cmake-codeowners we still need another review here. There's no change to other projects CMakeList; new additions here are to serve the new build system (skbuid-core).

@shwina shwina merged commit c52fcd0 into NVIDIA:main Feb 3, 2025
21 of 24 checks passed
@oleksandr-pavlyk oleksandr-pavlyk deleted the transition-cuda-cccl-to-scikit-build-core branch February 3, 2025 22:49
@@ -0,0 +1,63 @@
import pytest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice addition!

Is this run in the CI?

(For cuda_parallel and cuda_cooperative the tests are run from ci/test_python.sh)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[ENH]: Improve/Rewrite cuda.parallel's build system
6 participants